Performance Evaluation of Stream Log Collection Using HADOOP Distributed File System

Author

  • N. Ramasubramanian
Abstract

Stream logging has recently been widely adopted by web-based and product-based companies, and it is one of the most important topics on the agenda in business re-engineering. Business re-engineering is carried out to improve the effectiveness and productivity of a particular product or service. Stream logging is achieved at minimum cost using a transaction-based model over a distributed environment such as the HADOOP distributed file system. HADOOP is a distributed file system that helps to improve performance, scalability and reliability. Here a single-master, multiple-slave model is employed over HADOOP. The proposed model is based on analytics performed by Google for web pages. We present a macroscopic analysis of the workload, characterized by popularity and arrival process. Though numerous transaction models such as Valor and Ameno have been proposed, this model helps to achieve better utilization and execution time under reduced, constrained resources.

Keywords: thread scheduling, multi-core, kernel, BST-tree, block scheduling

Similar resources

TidyFS: A Simple and Small Distributed File System

In recent years, there has been an explosion of interest in computing using clusters of commodity, shared-nothing computers. In this paper, we describe the design of TidyFS, a simple and small distributed file system that provides the abstractions necessary for data-parallel computations on clusters. Similar to other large-scale distributed file systems such as the Google File System (GFS) and ...

An Efficient Design and Implementation of an MdbULPS in a Cloud-Computing Environment

Flexibly expanding the storage capacity required to process a large amount of rapidly increasing unstructured log data is difficult in a conventional computing environment. In addition, implementing a log processing system providing features that categorize and analyze unstructured log data is extremely difficult. To overcome such limitations, we propose and design a MongoDB-based unstructured ...

Hadoop Scalability and Performance Testing in Heterogeneous Clusters

This paper aims to evaluate cluster configurations using Hadoop in order to check parallelization performance and scalability in information retrieval. This evaluation will establish the necessary capabilities that should be taken into account specifically on a Distributed File System (HDFS: Hadoop Distributed File System), from the perspective of storage and indexing techniques, and query dis...

Distributed Metadata Management Scheme in HDFS

The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably and to stream those data sets at high bandwidth to user applications. Metadata management is critical to a distributed file system. In the HDFS architecture, a single master server manages all metadata, while a number of data servers store file data. This architecture can’t meet the exponentially increased stor...

Towards Efficient Design and Implementation of a Hadoop-based Distributed Video Transcoding System in Cloud Computing Environment

In this paper, we propose a Hadoop-based Distributed Video Transcoding System in a cloud computing environment that transcodes various video codec formats into the MPEG-4 video format. This system provides various types of video content to heterogeneous devices such as smart phones, personal computers, television, and pads. We design and implement the system using the MapReduce framework, which...

Publication date: 2013